12 research outputs found

    Apparatus and method for decomposing an audio signal using a variable threshold

    No full text
    An apparatus for decomposing an audio signal into a background component signal and a foreground component signal, has: a block generator for generating a time sequence of blocks of audio signal values; an audio signal analyzer for determining a characteristic of a current block of the audio signal and for determining a variability of the characteristic within a group of blocks having at least two blocks of the sequence of blocks; and a separator for separating the current block into a background portion and a foreground portion wherein the separator is configured to determine a separation threshold based on the variability and to separate the current block into the background component signal and the foreground component signal, when the characteristic of the current block is in a predetermined relation to the separation threshold
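
The abstract above does not say which characteristic is measured or how the threshold follows from the variability. As a minimal sketch, assume the characteristic is block energy and the threshold is the group mean plus `k` standard deviations; `separate_variable_threshold`, `block_size`, `group`, and `k` are hypothetical names chosen for illustration.

```python
from statistics import mean, pstdev


def separate_variable_threshold(signal, block_size=4, group=3, k=1.0):
    """Split a signal into (background, foreground) of the same length.

    Assumption (not from the abstract): the characteristic is block energy
    and the threshold is mean + k * std over a group of neighbouring blocks.
    """
    # Block generator: consecutive, non-overlapping blocks.
    blocks = [signal[i:i + block_size] for i in range(0, len(signal), block_size)]
    # Analyzer: one characteristic (energy) per block.
    energies = [sum(x * x for x in block) for block in blocks]

    background, foreground = [], []
    for i, block in enumerate(blocks):
        lo, hi = max(0, i - group), min(len(blocks), i + group + 1)
        neighbourhood = energies[lo:hi]
        # The threshold moves with the variability inside the group.
        threshold = mean(neighbourhood) + k * pstdev(neighbourhood)
        silence = [0.0] * len(block)
        if energies[i] > threshold:
            foreground.extend(block)
            background.extend(silence)
        else:
            background.extend(block)
            foreground.extend(silence)
    return background, foreground
```

With a quiet signal that contains one loud block, only the loud block should land in the foreground, and the two outputs sum back to the input.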

    Apparatus and method for decomposing an audio signal using a ratio as a separation characteristic

    No full text
    An apparatus for decomposing an audio signal into a background component signal and a foreground component signal includes: a block generator for generating a time sequence of blocks of audio signal values; an audio signal analyzer for determining a block characteristic of a current block of the audio signal and for determining an average characteristic for a group of blocks, the group of blocks including at least two blocks; and a separator for separating the current block into a background portion and a foreground portion in response to a ratio of the block characteristic of the current block and the average characteristic of the group of blocks, wherein the background component signal includes the background portion of the current block and the foreground component signal includes the foreground portion of the current block
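
The ratio criterion described above can be sketched in a few lines. The choice of energy as the block characteristic, the fixed `ratio_threshold`, and the function name `separate_by_ratio` are assumptions for illustration only; the abstract does not specify them.

```python
def separate_by_ratio(signal, block_size=4, group=2, ratio_threshold=2.0):
    """Split a signal into (background, foreground) using an energy ratio.

    Assumption (not from the abstract): a block is foreground when its energy
    divided by the average energy of its group exceeds ratio_threshold.
    """
    blocks = [signal[i:i + block_size] for i in range(0, len(signal), block_size)]
    energies = [sum(x * x for x in block) for block in blocks]

    background, foreground = [], []
    for i, block in enumerate(blocks):
        lo, hi = max(0, i - group), min(len(blocks), i + group + 1)
        average = sum(energies[lo:hi]) / (hi - lo)
        # Ratio of the current block's characteristic to the group average.
        ratio = energies[i] / average if average > 0.0 else 0.0
        is_foreground = ratio > ratio_threshold
        silence = [0.0] * len(block)
        foreground.extend(block if is_foreground else silence)
        background.extend(silence if is_foreground else block)
    return background, foreground
```

Because the ratio is normalised by the local average, the same `ratio_threshold` behaves similarly for loud and quiet passages, which is one plausible reason to prefer a ratio over an absolute level.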

    Phase derivative correction of bandwidth-extended signals for perceptual audio codecs

    No full text
    Bandwidth extension methods, such as spectral band replication (SBR), are often used in low-bit-rate codecs. They allow transmitting only a relatively narrow low-frequency region along with parametric information about the higher bands. The signal for the higher bands is obtained by simply copying it from the transmitted low-frequency region. The copied-up signal is processed by multiplying the magnitude spectrum by suitable gains based on the transmitted parameters to obtain a magnitude spectrum similar to that of the original signal. However, the phase spectrum of the copied-up signal is typically not processed but is used directly. In this paper we describe the perceptual consequences of directly using the copied-up phase spectrum. Based on the observed effects, two metrics for detecting the perceptually most significant effects are proposed. Based on these, methods for correcting the phase spectrum are proposed, as well as strategies for minimizing the number of additional parameter values transmitted for performing the correction. Finally, the results of formal listening tests are presented
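
To make the copy-up step concrete, here is a minimal sketch that works on a list of complex spectral bins. The function name `copy_up` and the one-gain-per-bin layout are assumptions for illustration; real SBR operates on filterbank subbands with gains derived from transmitted envelope parameters.

```python
import cmath


def copy_up(low_band, gains):
    """Create high-band bins by copying low-band bins and rescaling magnitudes.

    low_band: complex spectral bins of the transmitted low-frequency region.
    gains: one non-negative real gain per copied bin (hypothetical layout).
    The phase of every copied bin is reused unchanged.
    """
    if len(low_band) != len(gains):
        raise ValueError("need exactly one gain per copied bin")
    # A positive real gain scales the magnitude and leaves the phase as is.
    return [g * x for g, x in zip(gains, low_band)]


low = [cmath.rect(1.0, 0.3), cmath.rect(2.0, -1.2)]
high = copy_up(low, [0.5, 0.25])
magnitudes = [abs(x) for x in high]
phases = [cmath.phase(x) for x in high]
```

The magnitudes follow the gains while the phases are identical to those of the low band, which is the uncorrected behaviour whose perceptual consequences the abstract refers to.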

    Apparatus and method for harmonic-percussive-residual sound separation using a structure tensor on spectrograms

    No full text
    Apparatus and method for analysing a magnitude spectrogram of an audio signal for Harmonic-Percussive-Residual Sound Separation (HPSS), comprising: determining a change of a frequency for each time-frequency bin of a plurality of time-frequency bins of the magnitude spectrogram of the audio signal; and classifying each time-frequency bin into a signal component group depending on the change of the frequency. A structure tensor is applied to the image of the spectrogram for preprocessing or feature extraction by edge and corner detection, in particular by calculating predominant orientation angles in the spectrogram. The structure tensor can be considered a black box, where the input is a grayscale image and the outputs are, for each pixel, an angle corresponding to the direction of lowest change and a certainty or anisotropy measure for this direction. A local frequency change is extracted from the angles: it can be determined whether a time-frequency bin in the spectrogram belongs to a harmonic component (low local frequency change) or to a percussive component (high or infinite local frequency change). Examples of application: (figure 1) distinguishing between harmonic, percussive, and residual signal components by employing this orientation information; (figure 5) analysing an audio signal for upmixing to five audio output channels (front left, center, right, left surround, and right surround), where the harmonic weighting factor may be greater for generating the left, center, and right output channels than for generating the left surround and right surround output channels, and the percussive weighting factor may be smaller for generating the left, center, and right output channels than for generating the left surround and right surround output channels; (figure 6) computing source separation metrics (source-to-distortion ratio SDR, source-to-interference ratio SIR, and source-to-artifacts ratio SAR) in a recorded audio signal. For example, a vibrato in a singing voice has a high instantaneous frequency change rate; the assignment of a bin in the spectrogram to "residual" depends on the bin anisotropy
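
The structure-tensor idea can be illustrated for a single time-frequency bin. This is a rough sketch under stated assumptions, not the method claimed above: it uses central differences and an unweighted square window instead of smoothed derivative filters, and the thresholds `min_anisotropy`, `harmonic_max`, and `percussive_min` are hypothetical values.

```python
import math


def structure_tensor_at(spec, f, t, radius=1):
    """Return (angle, anisotropy) for bin (f, t) of spec[f][t].

    angle: direction of lowest change in radians, in [0, pi), measured from
    the time axis. anisotropy: ((l1 - l2) / (l1 + l2)) ** 2 for the tensor
    eigenvalues l1 >= l2, or 0.0 where the neighbourhood is flat.
    """
    n_f, n_t = len(spec), len(spec[0])
    t11 = t22 = t12 = 0.0
    # Accumulate gradient products over a small window (skipping the border).
    for ff in range(max(1, f - radius), min(n_f - 1, f + radius + 1)):
        for tt in range(max(1, t - radius), min(n_t - 1, t + radius + 1)):
            d_t = (spec[ff][tt + 1] - spec[ff][tt - 1]) / 2.0  # along time
            d_f = (spec[ff + 1][tt] - spec[ff - 1][tt]) / 2.0  # along frequency
            t11 += d_t * d_t
            t22 += d_f * d_f
            t12 += d_t * d_f
    # Orientation of the dominant gradient; lowest change is perpendicular.
    gradient_angle = 0.5 * math.atan2(2.0 * t12, t11 - t22)
    angle = (gradient_angle + math.pi / 2.0) % math.pi
    half_diff = math.hypot((t11 - t22) / 2.0, t12)  # (l1 - l2) / 2
    half_sum = (t11 + t22) / 2.0                    # (l1 + l2) / 2
    anisotropy = (half_diff / half_sum) ** 2 if half_sum > 0.0 else 0.0
    return angle, anisotropy


def classify_bin(angle, anisotropy, min_anisotropy=0.5,
                 harmonic_max=0.2, percussive_min=5.0):
    """Label a bin from its local frequency change (hypothetical thresholds)."""
    if anisotropy < min_anisotropy:
        return "residual"  # no clear orientation
    slope = abs(math.tan(angle))  # frequency bins per time frame
    if slope <= harmonic_max:
        return "harmonic"
    if slope >= percussive_min:
        return "percussive"
    return "residual"


# A horizontal ridge (constant frequency) and a vertical ridge (an onset).
harmonic_spec = [[1.0 if f == 2 else 0.0 for t in range(5)] for f in range(5)]
percussive_spec = [[1.0 if t == 2 else 0.0 for t in range(5)] for f in range(5)]
harmonic_label = classify_bin(*structure_tensor_at(harmonic_spec, 2, 2))
percussive_label = classify_bin(*structure_tensor_at(percussive_spec, 2, 2))
```

Bins on the horizontal ridge have an orientation along the time axis and are labelled harmonic. Bins on the vertical ridge have an effectively infinite slope and are labelled percussive. Flat regions fall back to residual because their anisotropy is zero.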

    Apparatus and method for encoding an audio signal using a compensation value

    No full text
    An apparatus for encoding an audio signal comprises: a core encoder for core encoding first audio data in a first spectral band; and a parametric coder for parametrically coding second audio data in a second spectral band being different from the first spectral band, wherein the parametric coder comprises: an analyzer for analyzing first audio data in the first spectral band to obtain a first analysis result and for analyzing second audio data in the second spectral band to obtain a second analysis result; a compensator for calculating a compensation value using the first analysis result and the second analysis result; and a parameter calculator for calculating a parameter from the second audio data in the second spectral band using the compensation value

    QMF based harmonic spectral band replication

    No full text
    Unified speech and audio coding (USAC) is the next step in the evolution of audio codecs that are standardized by the Moving Picture Experts Group (MPEG). USAC provides consistent quality for music, speech and mixed material, by extending the strength of an audio codec by speech codec functions. Two novel flavors of Spectral Band Replication (SBR) were introduced to enhance the perceptual quality of SBR: the Discrete Fourier Transform (DFT) based and the Quadrature Mirror Filterbank (QMF) harmonic SBR. The DFT-based SBR has higher frequency resolution for the harmonic transposition process, resulting in good sound quality. The QMF-based SBR has significantly lower computational complexity. This paper describes the detailed technical aspects of the low complexity QMF-based harmonic SBR tool within USAC. A complexity comparison and the listening test results are also presented in the paper